multidimensional time series
Inferring the Most Similar Variable-length Subsequences between Multidimensional Time Series
Rattanakornphan, Thanadej, Charoenpoonpanich, Piyanon, Amornbunchornvej, Chainarong
Finding the most similar subsequences between two multidimensional time series has many applications, e.g., capturing dependency in the stock market or discovering coordinated movement of baboons. Given a pattern occurring in one time series, we may wonder whether the same pattern occurs in another time series with some distortion, possibly with a different length. Nevertheless, to the best of our knowledge, there is no efficient framework that deals with this problem yet. In this work, we propose an algorithm that provides the exact solution for finding the most similar multidimensional subsequences between time series where lengths differ both between the time series and between the subsequences. The algorithm is built on theoretical guarantees of correctness and efficiency. Results on simulated datasets show that our approach not only provides the correct solution but also requires only a quarter of the running time of the baseline approaches. On real-world datasets, it extracts the most similar subsequences even faster (up to 20 times faster than the baseline methods) and provides insights into the stock market and into following relations in multidimensional time series of baboon movement. Our approach can be used for any time series. The code and datasets of this work are provided for public use.
Guaranteed Multidimensional Time Series Prediction via Deterministic Tensor Completion Theory
Shu, Hao, Li, Jicheng, Jin, Yu, Wang, Hailin
In recent years, the prediction of multidimensional time series data has become increasingly important due to its wide-ranging applications. Tensor-based prediction methods have gained attention for their ability to preserve the inherent structure of such data. However, existing approaches, such as tensor autoregression and tensor decomposition, often fail to provide clear assertions regarding the number of samples that can be exactly predicted. While matrix-based methods using nuclear norms address this limitation, their reliance on matrices limits accuracy and increases computational costs when handling multidimensional data. To overcome these challenges, we reformulate multidimensional time series prediction as a deterministic tensor completion problem and propose a novel theoretical framework. Specifically, we develop a deterministic tensor completion theory and introduce the Temporal Convolutional Tensor Nuclear Norm (TCTNN) model. By convolving the multidimensional time series along the temporal dimension and applying the tensor nuclear norm, our approach identifies the maximum forecast horizon for exact predictions. Additionally, TCTNN achieves superior performance in prediction accuracy and computational efficiency compared to existing methods across diverse real-world datasets, including climate temperature, network flow, and traffic ride data. Our implementation is publicly available at https://github.com/HaoShu2000/TCTNN.
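As a rough illustration of why a nuclear norm applied to temporally structured data can favour predictable signals, the sketch below builds a delay-embedding (Hankel) matrix from a univariate series and compares its rank for a periodic signal and for noise. This is a simplified matrix analogue chosen for illustration only; it is not the TCTNN model, and the helper name `hankel_matrix` and the window size are assumptions.

```python
import numpy as np

def hankel_matrix(x, window):
    """Stack `window` shifted copies of x as rows (a delay embedding).

    Hypothetical helper for illustration; not part of the TCTNN code base.
    """
    x = np.asarray(x, dtype=float)
    cols = len(x) - window + 1
    return np.array([x[i:i + cols] for i in range(window)])

t = np.arange(100)
periodic = np.sin(2 * np.pi * t / 20)      # a single sinusoid
rng = np.random.default_rng(0)
noise = rng.standard_normal(100)           # unstructured series

H_periodic = hankel_matrix(periodic, 30)   # shape (30, 71)
H_noise = hankel_matrix(noise, 30)

# A single sinusoid obeys a two-term linear recurrence, so its Hankel
# matrix has rank 2; the noise matrix is full rank.
rank_periodic = np.linalg.matrix_rank(H_periodic, tol=1e-8)
rank_noise = np.linalg.matrix_rank(H_noise, tol=1e-8)

# The nuclear norm (sum of singular values) is the usual convex
# surrogate for rank in completion problems.
nuc_periodic = np.linalg.norm(H_periodic, ord="nuc")
nuc_noise = np.linalg.norm(H_noise, ord="nuc")
```

In this toy setting the structured series yields a low-rank matrix while the noise does not, which is the kind of property a nuclear-norm-based completion model can exploit when filling in future (missing) entries.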
Transforming Multidimensional Time Series into Interpretable Event Sequences for Advanced Data Mining
Yan, Xu, Jiang, Yaoting, Liu, Wenyi, Yi, Didi, Wei, Jianjun
This paper introduces a novel spatiotemporal feature representation model designed to address the limitations of traditional methods in multidimensional time series (MTS) analysis. The proposed approach converts MTS into one-dimensional sequences of spatially evolving events, preserving the complex coupling relationships between dimensions. By employing a variable-length tuple mining method, key spatiotemporal features are extracted, enhancing the interpretability and accuracy of time series analysis. Unlike conventional models, this unsupervised method does not rely on large training datasets, making it adaptable across different domains. Experimental results from motion sequence classification validate the model's superior performance in capturing intricate patterns within the data. The proposed framework has significant potential for applications across various fields, including backend services for monitoring and optimizing IT infrastructure, medical diagnosis through continuous patient monitoring and health trend analysis, and internet businesses for tracking user behavior and forecasting sales. This work offers a new theoretical foundation and technical support for advancing time series data mining and its practical applications in human behavior recognition and other domains.
Matrix Profile for Anomaly Detection on Multidimensional Time Series
Yeh, Chin-Chia Michael, Der, Audrey, Saini, Uday Singh, Lai, Vivian, Zheng, Yan, Wang, Junpeng, Dai, Xin, Zhuang, Zhongfang, Fan, Yujie, Chen, Huiyuan, Aboagye, Prince Osei, Wang, Liang, Zhang, Wei, Keogh, Eamonn
The Matrix Profile (MP), a versatile tool for time series data mining, has been shown to be effective in time series anomaly detection (TSAD). This paper delves into the problem of anomaly detection in multidimensional time series, a common occurrence in real-world applications. For instance, in a manufacturing factory, multiple sensors installed across the site collect time-varying data for analysis. The Matrix Profile, named for its role in profiling the matrix storing pairwise distances between subsequences of a univariate time series, becomes complex in multidimensional scenarios. If the input univariate time series has n subsequences, the pairwise distance matrix is an n x n matrix. In a multidimensional time series with d dimensions, the pairwise distance information must be stored in an n x n x d tensor. In this paper, we first analyze different strategies for condensing this tensor into a profile vector. We then investigate the potential of extending the MP to efficiently find k-nearest neighbors for anomaly detection. Finally, we benchmark the multidimensional MP against 19 baseline methods on 119 multidimensional TSAD datasets. The experiments cover three learning setups: unsupervised, supervised, and semi-supervised. MP is the only method that consistently delivers high performance across all setups.
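The n x n x d distance tensor and its reduction to a profile vector can be sketched with a brute-force computation. The code below is a minimal illustration, assuming z-normalized Euclidean distance per dimension and a simple "average over dimensions, then take the nearest neighbour" condensing rule; this rule is only one possible strategy and is not claimed to be the one the paper recommends. The function names and the exclusion-zone width are assumptions.

```python
import numpy as np

def znorm(x):
    """Z-normalize a 1-D window; constant windows map to all zeros."""
    std = x.std()
    return (x - x.mean()) / std if std > 0 else np.zeros_like(x)

def distance_tensor(ts, m):
    """ts: (length, d) array. Returns an (n, n, d) tensor of z-normalized
    Euclidean distances between all pairs of length-m subsequences,
    computed independently for each dimension (brute force)."""
    length, d = ts.shape
    n = length - m + 1
    tensor = np.empty((n, n, d))
    for k in range(d):
        windows = np.array([znorm(ts[i:i + m, k]) for i in range(n)])
        diff = windows[:, None, :] - windows[None, :, :]
        tensor[:, :, k] = np.sqrt((diff ** 2).sum(axis=2))
    return tensor

def profile_by_mean(tensor, m):
    """Condense the tensor into a profile vector: average the distances
    over dimensions, mask trivial self-matches, keep the nearest neighbour."""
    dist = tensor.mean(axis=2)          # (n, n), a new array
    n = dist.shape[0]
    excl = m // 2                       # assumed exclusion-zone half-width
    for i in range(n):
        lo, hi = max(0, i - excl), min(n, i + excl + 1)
        dist[i, lo:hi] = np.inf
    return dist.min(axis=1)

# Toy data: two periodic dimensions with a flat anomaly injected in one.
t = np.arange(200)
ts = np.column_stack([np.sin(2 * np.pi * t / 20), np.cos(2 * np.pi * t / 20)])
ts[100:110, 0] = 1.5

m = 20
tensor = distance_tensor(ts, m)
profile = profile_by_mean(tensor, m)
anomaly_start = int(np.argmax(profile))  # subsequence farthest from its neighbour
```

Subsequences with a large profile value have no close match elsewhere in the series, so the highest values are natural anomaly candidates. The brute-force tensor costs O(n^2 d) memory, which is why efficient condensing strategies matter in practice.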
A method for recovery of multidimensional time series based on the detection of behavioral patterns and the use of autoencoders
This article presents a method for recovering missing values in multidimensional time series. The method combines neural network technologies and an algorithm for searching snippets (behavioral patterns of a time series). It includes the stages of data preprocessing, recognition and reconstruction, using convolutional and recurrent neural networks. Experiments have shown high accuracy of recovery and the advantage of the method over SOTA methods.
Sketching Multidimensional Time Series for Fast Discord Mining
Yeh, Chin-Chia Michael, Zheng, Yan, Pan, Menghai, Chen, Huiyuan, Zhuang, Zhongfang, Wang, Junpeng, Wang, Liang, Zhang, Wei, Phillips, Jeff M., Keogh, Eamonn
Time series discords are a useful primitive for time series anomaly detection, and the matrix profile is capable of capturing discords effectively. There exist many research efforts to improve the scalability of discord discovery with respect to the length of time series. However, there is surprisingly little work focused on reducing the time complexity of matrix profile computation associated with the dimensionality of a multidimensional time series. In this work, we propose a sketch for discord mining among multidimensional time series. After an initial pre-processing of the sketch, which is as fast as reading the data, the discord mining has runtime independent of the dimensionality of the original data. On several real-world examples from water treatment and transportation, the proposed algorithm improves the throughput by at least an order of magnitude (50X) and only has minimal impact on the quality of the approximated solution. Additionally, the proposed method can handle the dynamic addition or deletion of dimensions with inconsequential overhead. This allows a data analyst to consider "what-if" scenarios in real time while exploring the data.
High Dimensional Time Series Generators
Bachmann, Jörg P., Freytag, Johann-Christoph
Multidimensional time series are sequences of real-valued vectors. They occur in different areas, for example handwritten characters, GPS tracking, and gestures of modern virtual reality motion controllers. Within these areas, a common task is to search for similar time series. Dynamic Time Warping (DTW) is a common distance function to compare two time series. The Edit Distance with Real Penalty (ERP) and the Dog Keeper Distance (DK) are two more distance functions on time series. Their behaviour has been analyzed on 1-dimensional time series. However, it is not easy to evaluate their behaviour in relation to growing dimensionality. For this reason we propose two new data synthesizers generating multidimensional time series. The first synthesizer extends the well-known cylinder-bell-funnel (CBF) dataset to multidimensional time series. Here, each time series has an arbitrary type (cylinder, bell, or funnel) in each dimension, thus for $d$-dimensional time series there are $3^{d}$ different classes. The second synthesizer (RAM) creates time series with ideas adapted from Brownian motion, which is a common model of movement in physics. Finally, we evaluate the applicability of a 1-nearest neighbor classifier using DTW on datasets generated by our synthesizers.
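For readers unfamiliar with DTW on vector-valued sequences, the following is a minimal sketch of the textbook dynamic-programming formulation together with a 1-nearest-neighbor classifier built on it. It assumes the Euclidean norm as the local cost between vectors and no warping-window constraint; the function names are hypothetical and the paper's own implementation may differ.

```python
import numpy as np

def dtw_distance(a, b):
    """DTW between a: (n, d) and b: (m, d) with Euclidean local cost.

    Classic O(n * m) dynamic program, no window constraint.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] = Euclidean distance between vectors a[i] and b[j]
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j],      # repeat b[j-1]
                acc[i, j - 1],      # repeat a[i-1]
                acc[i - 1, j - 1],  # advance both
            )
    return acc[n, m]

def one_nn_classify(query, train_series, train_labels):
    """Return the label of the training series closest to query under DTW."""
    dists = [dtw_distance(query, s) for s in train_series]
    return train_labels[int(np.argmin(dists))]

# Two 2-dimensional series that differ only by a repeated sample.
a = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
b = np.array([[0.0, 0.0], [1.0, 1.0], [1.0, 1.0], [2.0, 2.0]])
c = a + np.array([1.0, 0.0])   # a shifted by one unit in the first dimension

d_ab = dtw_distance(a, b)
d_ac = dtw_distance(a, c)
label = one_nn_classify(b, [a, c + 5.0], ["near", "far"])

# Number of classes in the multidimensional CBF generator for d = 4.
num_classes = 3 ** 4
```

Because warping absorbs the repeated sample, `a` and `b` are at distance zero, while the shifted copy `c` is not. Note that the dimensions are warped jointly here (one alignment for all dimensions), which is one common convention for multidimensional DTW.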